Does Category A Anchor Text Improve Category B Results?
Abstract
Associating anchor text with the pages that the links point to is a well-known approach to improving retrieval quality. It was used in the first version of Google [Brin and Page 1998]. On the one hand, anchor text alone is enough to build a system with decent performance [Anh and Moffat 2010; Hiemstra and Hauff 2010], and our own experiments in TREC 2011 also showed that anchor text is a strong relevance signal [Boytsov and Belova 2011]. On the other hand, the anchor text is much smaller than the full text of the collection. Thus, enriching the Category B index (built over 50M documents) with the Category A anchor text index (built over 370M short documents) seemed to be an appealing method of improving performance at little cost.
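As a rough illustration of the idea of associating anchor text with link targets, the following minimal sketch groups anchor texts by the page they point to and keeps only targets that belong to a smaller collection. It is not the authors' pipeline: the function name `build_anchor_documents`, the `(source, target, anchor_text)` triple layout, and the example URLs are all assumptions made for this example.

```python
from collections import defaultdict


def build_anchor_documents(links, target_urls):
    """Group anchor texts by the page each link points to.

    links: iterable of (source_url, target_url, anchor_text) triples
           (hypothetical input format, assumed for this sketch).
    target_urls: set of URLs in the smaller collection being enriched.
    Returns a dict mapping each target URL to one short pseudo-document.
    """
    anchors = defaultdict(list)
    for _source, target, text in links:
        text = text.strip()
        # Keep only non-empty anchors that point into the target collection.
        if text and target in target_urls:
            anchors[target].append(text)
    # Concatenate the anchors of each page into a single short document.
    return {url: " ".join(texts) for url, texts in anchors.items()}


# Toy link graph: links come from a larger crawl, targets from a subset.
links = [
    ("http://a.example/1", "http://b.example/home", "Example home page"),
    ("http://a.example/2", "http://b.example/home", "official site"),
    ("http://a.example/3", "http://c.example/other", "unrelated page"),
    ("http://a.example/4", "http://b.example/home", "   "),
]
subset_urls = {"http://b.example/home"}
anchor_docs = build_anchor_documents(links, subset_urls)
```

Each resulting pseudo-document could then be indexed as an extra field or as a separate short document alongside the page text; which of these works better is an empirical question that this sketch does not address.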
Related resources
Using Anchor Text, Spam Filtering and Wikipedia for Web Search and Entity Ranking
In this paper, we document our efforts in participating in the TREC 2010 Entity Ranking and Web Tracks. We had multiple aims: For the Web Track we wanted to compare the effectiveness of anchor text of the category A and B collections and the impact of global document quality measures such as PageRank and spam scores. For the Entity Ranking Track, we use Wikipedia as a pivot to find relevant ent...
Selecting the Right Cause from the Right Category: Does the Role of Product Category Matter in Cause-Brand Alliance? A Case Study of Students in Shanghai Universities
Increased competition is making it difficult to distinguish products solely by attributes, creating room for cause-related marketing. In this study with a sample of 322 university students, we evaluated the changes in consumer attitudes toward cause and brand as consequences of Cause Brand Alliance (CBA), by using the product category as moderator. Four popular brands from two product categorie...
Using anchor text for homepage and topic distillation search tasks
Past work suggests that anchor text is a good source of evidence that can be used to improve web searching. Two approaches for making use of this evidence include fusing search results from an anchor text representation and the original text representation based on a document’s relevance score or rank position, and combining term frequency from both representations during the retrieval process....
Does routine repeat testing of critical laboratory values improve their accuracy?
Background: Routine repeat testing of critical laboratory values is very common these days, both to increase their accuracy and to avoid reporting false or infeasible results. We investigated whether repeat testing of critical laboratory values has any benefit. Methods: We examined 2233 repeated critical laboratory values in 13 different hematology and chemistry tests including: hemoglobin, white...
Ad Hoc and Diversity Retrieval at the University of Delaware
We indexed ClueWeb using the Indri retrieval engine [6]. Due to disk space constraints, we elected to use the Category B subset of 50 million English-language web pages only. We indexed the full documents. We included field information such as title, headings, and bold/italic markup, and dropped script and style tags. We did not index anchor text. We used the Krovetz stemmer and a simple stopwo...